28 research outputs found
Controllable Image Generation via Collage Representations
Recent advances in conditional generative image models have enabled
impressive results. On the one hand, text-based conditional models have
achieved remarkable generation quality, by leveraging large-scale datasets of
image-text pairs. To enable fine-grained controllability, however, text-based
models require long prompts, whose details may be ignored by the model. On the
other hand, layout-based conditional models have also witnessed significant
advances. These models rely on bounding boxes or segmentation maps for precise
spatial conditioning in combination with coarse semantic labels. The semantic
labels, however, cannot be used to express detailed appearance characteristics.
In this paper, we approach fine-grained scene controllability through image
collages which allow a rich visual description of the desired scene as well as
the appearance and location of the objects therein, without the need of class
nor attribute labels. We introduce "mixing and matching scenes" (M&Ms), an
approach that consists of an adversarially trained generative image model which
is conditioned on appearance features and spatial positions of the different
elements in a collage, and integrates these into a coherent image. We train our
model on the OpenImages (OI) dataset and evaluate it on collages derived from
OI and MS-COCO datasets. Our experiments on the OI dataset show that M&Ms
outperforms baselines in terms of fine-grained scene controllability while
being very competitive in terms of image quality and sample diversity. On the
MS-COCO dataset, we highlight the generalization ability of our model by
outperforming DALL-E in terms of the zero-shot FID metric, despite using two
magnitudes fewer parameters and data. Collage based generative models have the
potential to advance content creation in an efficient and effective way as they
are intuitive to use and yield high quality generations
Gaussian fields for semi-supervised regression and correspondence learning
Gaussian fields (GF) have recently received considerable attention for dimension reduction and semi-supervised classification. In this paper we show how the GF framework can be used for semi-supervised regression on high-dimensional data. We propose an active learning strategy based on entropy minimization and a maximum likelihood model selection method. Furthermore, we show how a recent generalization of the LLE algorithm for correspondence learning can be cast into the GF framework, which obviates the need to choose a representation dimensionality
The global k-means clustering algorithm
We present the global k-means algorithm which is an incremental approach to clustering that dynamically adds one cluster center at a time through a deterministic global search procedure consisting of N (with N being the size of the data set) executions of the k-means algorithm from suitable initial positions. We also propose modifications of the method to reduce the computational load without significantly affecting solution quality. The proposed clustering methods are tested on well-known data sets and they compare favorably to the k-means algorithm with random restarts
The Global k-Means Clustering Algorithm
We present the global k-means algorithm which is an incremental approach to clustering that dynamically adds one cluster center at a time through a deterministic global search procedure consisting of N (with N being the size of the data set) executions of the k-means algorithm from suitable initial positions. We also propose modifications of the method to reduce the computational load without significantly a#ecting solution quality. The proposed clustering methods are tested on well-known data sets and they compare favorably to the k-means algorithm with random restarts
Accelerated EM-based clustering of large data sets
Motivated by the poor performance (linear complexity) of the EM algorithm in clustering large data sets, and inspired by the successful accelerated versions of related algorithms like k-means, we derive an accelerated variant of the EM algorithm for Gaussian mixtures that: (1) offers speedups that are at least linear in the number of data points, (2) ensures convergence by strictly increasing a lower bound on the data log-likelihood in each learning step, and (3) allows ample freedom in the design of other accelerated variants. We also derive a similar accelerated algorithm for greedy mixture learning, where very satisfactory results are obtained. The core idea is to define a lower bound on the data log-likelihood based on a grouping of data points. The bound is maximized by computing in turn (i) optimal assignments of groups of data points to the mixture components, and (ii) optimal re-estimation of the model parameters based on average sufficient statistics computed over groups of data points. The proposed method naturally generalizes to mixtures of other members of the exponential family. Experimental results show the potential of the proposed method over other state-of-the-art acceleration techniques